Cross-Granularity Graph Inference for Semantic Video Object Segmentation
نویسندگان
چکیده
We address semantic video object segmentation via a novel cross-granularity hierarchical graphical model to integrate tracklet and object proposal reasoning with superpixel labeling. Tracklet characterizes varying spatial-temporal relations of video object which, however, quite often suffers from sporadic local outliers. In order to acquire highquality tracklets, we propose a transductive inference model which is capable of calibrating shortrange noisy object tracklets with respect to longrange dependencies and high-level context cues. In the center of this work lies a new paradigm of semantic video object segmentation beyond modeling appearance and motion of objects locally, where the semantic label is inferred by jointly exploiting multi-scale contextual information and spatialtemporal relations of video object. We evaluate our method on two popular semantic video object segmentation benchmarks and demonstrate that it advances the state-of-the-art by achieving superior accuracy performance than other leading methods.
منابع مشابه
Beyond Semantic Image Segmentation : Exploring Efficient Inference in Video
Deep convolutional neural networks (DCNNs) trained on a large number of images with pixel-level annotations or a combination of strongly labeled and weakly-labeled images have recently been the state-of-the-art in semantic image segmentation with significant performance improvement. However, due to the very invariance properties that make DCNNs good for high level tasks such as classification, ...
متن کاملRecursive Inference for Prediction of Objects in Urban Environments
Future advancements in robotic navigation and mapping rest to a large extent on robust, efficient and more advanced semantic understanding of the surrounding environment. The existing semantic mapping approaches typically consider small number of semantic categories, require complex inference or large number of training examples to achieve desirable performance. In the proposed work we present ...
متن کاملMultimedia Content Understanding : Bringing Context to Content
In this paper, we propose a framework to extendsemantic labeling of images to video shot sequences and achieveefficient and semantic-aware spatiotemporal video segmentation.This task faces two major challenges, namely the temporal vari-ations within a video sequence which affect image segmentationand labeling, and the computational cost of region labeling.Guided by these...
متن کاملImproving Semantic Video Segmentation by Dynamic Scene Integration
Multi-class image segmentation and pixel-level labeling of the frames that make up a video could be made more efficient by incorporating temporal information. Recently, Convolutional Neural Networks (ConvNets) have made an impressive positive impact on the single image segmentation problem. In this paper, in order to further increase labeling accuracy, we propose a method for integrating short-...
متن کاملHierarchical semantic segmentation of image scene with object labeling
Semantic segmentation of an image scene provides semantic information of image regions while less information of objects. In this paper, we propose a method of hierarchical semantic segmentation, including scene level and object level, which aims at labeling both scene regions and objects in an image. In the scene level, we use a feature-based MRF model to recognize the scene categories. The ra...
متن کامل